ABSTRACT

The overall test for lack of fit in autoregressive-moving average models
proposed by Box and Pierce (1970) is considered. It is shown that a
substantially improved approximation results from a simple modification of this
test. Some consideration is given to the power of such tests and their
robustness when the innovations are non-normal. Similar modifications in the
overall tests used for transfer function-noise models are proposed.
SIGNIFICANCE AND EXPLANATION

Very many physical situations are described by a series of numbers occurring
sequentially in time, the observed numbers arising from a deterministic
underlying smoothly varying sequence perturbed by random errors. Typical
examples are observations of the position of a vehicle or missile at successive
intervals in time, and many other examples occur in connection with engineering
production and business management.

Such time series are normally analyzed, after allowing for the deterministic
part, by assuming some underlying model for the process involved, the relevant
model in the present work being an autoregressive moving average model described
in the first paragraph of the paper. This is the model on which the Box-Jenkins
technique is based.

If the parameters in a model are estimated from an experimental time series,
the next question is the adequacy of fit of the data by the model. In this
connection it is useful to study the residuals and their autocorrelations
$\hat{r}_k$. Box and Pierce (1970) propose an overall test for lack of fit based
on approximating the distribution of

$$Q(\hat{r}) = n \sum_{k=1}^{m} \hat{r}_k^2$$

by the $\chi^2_{m-p-q}$ distribution, where n is the number of observations in
the time series, m is the number of lags, and p + q is the number of parameters
in the model. Recent studies have shown that this approximation is not adequate
unless n is large relative to m.

The present paper illustrates that the use of the modified test statistic

$$\tilde{Q}(\hat{r}) = n(n+2) \sum_{k=1}^{m} (n-k)^{-1} \hat{r}_k^2$$

leads to a substantially improved approximation. The power of the overall test
and its robustness to non-normality of the innovations (i.e. random
perturbations on the data) in the model are discussed briefly. Some
consideration is also given to testing for lack of fit in transfer
function-noise models, where one has two time series, variations in one series
being related to variations in the other.


ON  A MEASURE  OF  LACK  OF  FIT  IN  TIME  SERIES  MODELS 


G.  M.  Ljung  and  G.  E.  P.  Box 


1. INTRODUCTION

Consider a discrete time series $\{w_t\}$ generated by a stationary
autoregressive moving average model

$$\phi(B) w_t = \theta(B) a_t \tag{1.1}$$

where $\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^p$,
$\theta(B) = 1 - \theta_1 B - \dots - \theta_q B^q$, $B^k w_t = w_{t-k}$,
$\{a_t\}$ is a sequence of independent and identically distributed
$N(0, \sigma^2)$ random deviates, and where the $w_t$ can represent the d-th
difference or some other suitable transformation of a nonstationary series
$\{z_t\}$.

After a model of this form has been fitted to a series $w_1, \dots, w_n$, it
is useful to study the adequacy of the fit by examining in various ways the
residuals $\hat{a}_1, \dots, \hat{a}_n$ and, in particular, their
autocorrelations

$$\hat{r}_k = \frac{\sum_{t=k+1}^{n} \hat{a}_t \hat{a}_{t-k}}{\sum_{t=1}^{n} \hat{a}_t^2}, \qquad k = 1, 2, \dots$$

An informal graphical analysis of these quantities combined with overfitting
[see, for example, Box and Jenkins (1970)] usually proves most effective in
detecting possible deficiencies in the model. In addition, however, it is often
worthwhile to look at an overall criterion of adequacy of fit. Box and Pierce
(1970) noted that if the model were appropriate and the parameters were known,
the quantity

$$\tilde{Q}(r) = n(n+2) \sum_{k=1}^{m} (n-k)^{-1} r_k^2 , \tag{1.2}$$

where

$$r_k = \frac{\sum_{t=k+1}^{n} a_t a_{t-k}}{\sum_{t=1}^{n} a_t^2} ,$$
would for large n be distributed as $\chi^2_m$ since the limiting distribution
of $r = (r_1, \dots, r_m)'$ is multivariate normal with mean vector zero
[Anderson (1942), Anderson and Walker (1964)], $\mathrm{Var}(r_k) = (n-k)/n(n+2)$
and $\mathrm{Cov}(r_k, r_l) = 0$, $k \neq l$. Using the further approximation
$\mathrm{Var}(r_k) = 1/n$, Box and Pierce suggested that the distribution of

$$Q(r) = n \sum_{k=1}^{m} r_k^2 \tag{1.3}$$

could be approximated by that of $\chi^2_m$. Furthermore, they showed that when
the p + q parameters of an appropriate model are estimated and the $\hat{r}_k$
replace the $r_k$, then

$$Q(\hat{r}) = n \sum_{k=1}^{m} \hat{r}_k^2$$

would for large n be distributed as $\chi^2_{m-p-q}$, yielding an approximate
test for lack of fit.

In applications of this test, suspiciously low values of $Q(\hat{r})$ have
sometimes been observed and studies by Ljung and Box (1976) and Davies et al.
(1977) have verified that the distribution of $Q(\hat{r})$ can deviate from
$\chi^2_{m-p-q}$. This observation was also made by Prothero and Wallis (1976,
Discussion). The observed discrepancies could be accounted for by several
factors, among these departures from normality of the autocorrelations. It
appears, however, that the main difficulty is caused by the approximation of
(1.2) by (1.3).


A modified test based on the criterion

$$\tilde{Q}(\hat{r}) = n(n+2) \sum_{k=1}^{m} (n-k)^{-1} \hat{r}_k^2$$

was recommended by Ljung and Box (1976) but its usefulness was questioned by
Davies et al. (1977) on the ground that the variance of $\tilde{Q}(\hat{r})$
exceeds that of the $\chi^2_{m-p-q}$ distribution. Our studies show, however,
that the modified test provides a substantially improved approximation that
should be adequate for most practical purposes.
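The two criteria introduced in this section can be computed directly from a residual series. The following is a minimal sketch in Python, not code from the paper: the function names `autocorrelations`, `box_pierce` and `modified_q` are ours, and the autocorrelations are taken without mean correction, as in the definition of $\hat{r}_k$ above.

```python
from typing import List, Sequence


def autocorrelations(resid: Sequence[float], m: int) -> List[float]:
    """r_k = sum_{t=k+1}^n a_t a_{t-k} / sum_{t=1}^n a_t^2 for k = 1..m."""
    n = len(resid)
    denom = sum(a * a for a in resid)
    return [
        sum(resid[t] * resid[t - k] for t in range(k, n)) / denom
        for k in range(1, m + 1)
    ]


def box_pierce(resid: Sequence[float], m: int) -> float:
    """Q = n * sum_{k=1}^m r_k^2, the original overall criterion."""
    n = len(resid)
    return n * sum(r * r for r in autocorrelations(resid, m))


def modified_q(resid: Sequence[float], m: int) -> float:
    """Q~ = n(n+2) * sum_{k=1}^m r_k^2 / (n-k), the modified criterion."""
    n = len(resid)
    r = autocorrelations(resid, m)
    return n * (n + 2) * sum(r[k - 1] ** 2 / (n - k) for k in range(1, m + 1))


if __name__ == "__main__":
    # Toy residuals chosen only so the arithmetic is easy to follow by hand.
    toy = [1.0, -1.0, 1.0, -1.0]
    print(box_pierce(toy, 1), modified_q(toy, 1))
```

Because $(n+2)/(n-k) > 1$ for every lag, the modified criterion is never smaller than the original one on the same residuals.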


2. MEANS AND VARIANCES OF $Q(r)$ AND $\tilde{Q}(r)$

To examine the overall test, it is useful to initially consider the quantities
$Q(r)$ and $\tilde{Q}(r)$ which involve the white noise autocorrelations $r_k$.
Since the limiting distribution of r is $N(0, n^{-1} I_m)$, $Q(r)$ and
$\tilde{Q}(r)$ are asymptotically distributed as $\chi^2_m$ and have expectation
m and variance 2m.

For finite values of n, $\tilde{Q}(r)$ has expectation m, whereas

$$E\,Q(r) = n \sum_{k=1}^{m} E(r_k^2) = \frac{mn}{n+2} \left\{ 1 - \frac{m+1}{2n} \right\} . \tag{2.1}$$


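The variance of $Q(r)$ is

$$\mathrm{Var}\,Q(r) = n^2 \left\{ \sum_{k=1}^{m} \mathrm{Var}(r_k^2) + 2 \sum_{k<l} \mathrm{Cov}(r_k^2, r_l^2) \right\} \tag{2.2}$$

where for fixed n, $\mathrm{Cov}(r_k^2, r_l^2)$ is non-zero. The univariate and
bivariate moments of the $r_k$'s needed to evaluate (2.2) can be obtained using
the identity

$$E(r_k^{i} r_l^{j}) = \frac{E\left\{ \left( \sum_{t=k+1}^{n} a_t a_{t-k} \right)^{i} \left( \sum_{t=l+1}^{n} a_t a_{t-l} \right)^{j} \right\}}{E\left( \sum_{t=1}^{n} a_t^2 \right)^{i+j}} \tag{2.3}$$

which follows from independence of the $r_k$ and $\sum a_t^2$ (see, for example,
Anderson (1971), p. 304). Taking $\mathrm{Var}(a_t) = 1$ without loss of
generality, $\sum a_t^2$ is distributed as $\chi^2_n$ and
$E(\sum a_t^2)^{i+j} = n(n+2) \cdots (n+2i+2j-2)$. The term in the numerator of
(2.3) can be evaluated by multiplying term by term and taking the expected
value. Using this procedure, it can be verified that for k < n/2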


$$\mathrm{Var}(r_k^2) = \frac{6(3n-5k) + 3(n-k)^2}{n(n+2)(n+4)(n+6)} - \frac{(n-k)^2}{n^2(n+2)^2}$$

$$\mathrm{Cov}(r_k^2, r_l^2) = \frac{(n-k)(n-l) + 4(n-l) + 8(n-k-l)}{n(n+2)(n+4)(n+6)} - \frac{(n-k)(n-l)}{n^2(n+2)^2} \tag{2.4}$$


The exact variances of $Q(r)$ and $\tilde{Q}(r)$ are readily evaluated using
(2.2) and (2.4). By ignoring terms of order higher than 1/n it may be shown that
approximately, for n large relative to m,

$$\mathrm{Var}\,Q(r) = 2m \left\{ 1 + \frac{m-10}{n} \right\}$$

$$\mathrm{Var}\,\tilde{Q}(r) = 2m \left\{ 1 + \frac{2m-5}{n} \right\} .$$

The variance of $\tilde{Q}(r)$ exceeds 2m but the absence of a location bias
makes its distribution much closer to $\chi^2_m$ than that of $Q(r)$. This is
illustrated in Figure 1 which compares Monte Carlo distributions of $Q(r)$ and
$\tilde{Q}(r)$ based on 1000 replications to the $\chi^2_{30}$ distribution for
m = 30 and n = 100. The means and variances of the observed distributions are
$\bar{Q}(r) = 24.97$, $\bar{\tilde{Q}}(r) = 30.17$, $s^2_Q = 60.47$ and
$s^2_{\tilde{Q}} = 88.25$ and agree quite closely with the corresponding
theoretical values 24.85, 30.00, 63.15 and 91.48. Also shown by dashed lines in
Figure 1 is a distribution of the form $a\chi^2_b$ for which both the mean and
variance are adjusted to correspond with those of $\tilde{Q}(r)$. It is seen
that there is perhaps a somewhat better agreement in the upper tail area but the
main improvement results from adjusting the mean.
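The theoretical values quoted above for m = 30 and n = 100 can be reproduced from the moment formulas of this section. The sketch below is ours, not the authors' program: it uses $E(r_k^2) = (n-k)/n(n+2)$ together with the variance and covariance expressions for $r_k^2$, applies the covariance expression with k < l and k + l < n (an assumption on our part about its range of validity), and the function names are hypothetical.

```python
def var_r2(n: int, k: int) -> float:
    """Var(r_k^2) for white-noise autocorrelations, k < n/2."""
    d = n * (n + 2) * (n + 4) * (n + 6)
    e = n ** 2 * (n + 2) ** 2
    return (6 * (3 * n - 5 * k) + 3 * (n - k) ** 2) / d - (n - k) ** 2 / e


def cov_r2(n: int, k: int, l: int) -> float:
    """Cov(r_k^2, r_l^2); assumed here to hold for k < l and k + l < n."""
    d = n * (n + 2) * (n + 4) * (n + 6)
    e = n ** 2 * (n + 2) ** 2
    num = (n - k) * (n - l) + 4 * (n - l) + 8 * (n - k - l)
    return num / d - (n - k) * (n - l) / e


def exact_moments(n: int, m: int, modified: bool):
    """Mean and variance of sum_k c_k r_k^2 with c_k = n or n(n+2)/(n-k)."""
    c = {k: (n * (n + 2) / (n - k) if modified else float(n))
         for k in range(1, m + 1)}
    mean = sum(c[k] * (n - k) / (n * (n + 2)) for k in c)
    var = sum(c[k] ** 2 * var_r2(n, k) for k in c)
    var += 2 * sum(c[k] * c[l] * cov_r2(n, k, l)
                   for k in range(1, m + 1) for l in range(k + 1, m + 1))
    return mean, var


if __name__ == "__main__":
    for flag in (False, True):
        mean, var = exact_moments(100, 30, modified=flag)
        print("modified" if flag else "original", round(mean, 2), round(var, 2))
```

Setting `modified=True` simply swaps the weight n for n(n+2)/(n-k) in each term, which is the only difference between the two criteria.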

3. THE TEST STATISTICS $Q(\hat{r})$ AND $\tilde{Q}(\hat{r})$

Box and Pierce (1970) showed that the residual autocorrelations
$\hat{r} = (\hat{r}_1, \dots, \hat{r}_m)'$ from a correctly identified and
fitted model can to a close approximation be represented as

$$\hat{r} = (I - D) r \tag{3.1}$$

where I - D is an idempotent matrix of rank m-p-q. Using this relationship,
the expectation of $Q(\hat{r})$ is

$$E\,Q(\hat{r}) \approx E\{ n r'(I-D) r \} = \mathrm{tr}\{ n(I-D)C \} ,$$

where C is the exact covariance matrix of r. The matrix D has its largest
elements in the upper left corner with the remaining elements $d_{ij}$
decreasing to zero as i and/or j increases. The matrix DC is therefore nearly
equal to $n^{-1}D$. Using this approximation and noting that
$E\,Q(r) = \mathrm{tr}(nC)$, we have

$$E\,Q(\hat{r}) \approx E\,Q(r) - p - q . \tag{3.2}$$

Combining (2.1) and (3.2), the expected value of $Q(\hat{r})$ is approximately

$$E\,Q(\hat{r}) \approx \frac{mn}{n+2} \left\{ 1 - \frac{m+1}{2n} \right\} - p - q , \tag{3.3}$$
which indicates that the distribution of $Q(\hat{r})$ can deviate markedly from
$\chi^2_{m-p-q}$ unless n is large relative to m. However, using the same
approximations it can be shown that

$$E\,\tilde{Q}(\hat{r}) \approx E\,\tilde{Q}(r) - p - q = m - p - q .$$

It may be expected therefore that the distribution of $\tilde{Q}(\hat{r})$
might be approximated by the $\chi^2_{m-p-q}$ distribution.

The adequacy of this approximation was questioned by Davies et al. (1977) on
the ground that the variance of $\tilde{Q}(\hat{r})$ exceeds 2(m-p-q). However,
results from a simulation study reported in the next section suggest that the
reduction in the location bias results as before in a markedly improved
approximation that should be adequate for most practical purposes. It also
appears that the expression for the variance given by Davies et al., which is
not exact, overestimates the variance of $\tilde{Q}(\hat{r})$. For example, for
fitting a first order autoregressive model to white noise, Davies et al. obtain
for m = 20, n = 50, 100 and 200, $\mathrm{Var}\,\tilde{Q}(\hat{r})$ = 58.80,
50.08 and 44.20, respectively, while our study gives
$\mathrm{Var}\,\tilde{Q}(\hat{r})$ = 46.84, 43.20 and 41.97, respectively.
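The size of the location bias discussed in this section is easy to tabulate. The sketch below is an illustration we add, with hypothetical function names: it evaluates the approximate expectation of the original criterion, $\{mn/(n+2)\}\{1-(m+1)/2n\} - p - q$, next to the nominal degrees of freedom m - p - q, for a fitted first order model (p + q = 1).

```python
def expected_q_hat(n: int, m: int, p: int, q: int) -> float:
    """Approximate E Q(r^) = mn/(n+2) * (1 - (m+1)/(2n)) - p - q."""
    return m * n / (n + 2) * (1 - (m + 1) / (2 * n)) - p - q


def nominal_df(m: int, p: int, q: int) -> int:
    """Degrees of freedom m - p - q of the reference chi-square distribution."""
    return m - p - q


if __name__ == "__main__":
    # Combinations of n and m chosen by us for illustration, with p = 1, q = 0.
    for n, m in [(50, 10), (50, 20), (100, 30), (200, 30)]:
        approx = expected_q_hat(n, m, 1, 0)
        print(n, m, round(approx, 2), nominal_df(m, 1, 0))
```

The gap between the two columns shrinks as n grows relative to m, which is the sense in which the original approximation requires n large relative to m.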


4.  SOME  NUMERICAL  RESULTS 


Comparison  of  the  Overall  Tests 

A Monte Carlo study was conducted by generating 4000 sets of observations
$\{w_1, \dots, w_n\}$ from the first order autoregressive model
$w_t - \phi w_{t-1} = a_t$, estimating $\phi$ by the approximate maximum
likelihood estimator

$$\hat{\phi} = \frac{n-2}{n-1} \cdot \frac{\sum_{t=2}^{n} w_t w_{t-1}}{\sum_{t=2}^{n-1} w_t^2}$$

[Box and Jenkins (1970), p. 279], and calculating autocorrelations of the
residuals $\hat{a}_1 = (1 - \hat{\phi}^2)^{1/2} w_1$,
$\hat{a}_t = w_t - \hat{\phi} w_{t-1}$, t = 2, ..., n. The statistics
$Q(\hat{r})$ and $\tilde{Q}(\hat{r})$ were then calculated.
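The simulation design just described can be sketched as follows. This is our own illustrative reimplementation using only the Python standard library, not the authors' program: the stationary starting value, the guard against $|\hat{\phi}| \ge 1$, the seed, the number of replications, and the hard-coded critical value 16.919 (upper 5 percent point of $\chi^2_9$, appropriate for m = 10 and one fitted parameter) are all assumptions of the sketch.

```python
import math
import random


def simulate_ar1(n, phi, rng):
    """Draw w_1..w_n from w_t - phi*w_{t-1} = a_t, a_t ~ N(0,1), stationary start."""
    w = [rng.gauss(0.0, 1.0) / math.sqrt(1.0 - phi * phi)]
    for _ in range(n - 1):
        w.append(phi * w[-1] + rng.gauss(0.0, 1.0))
    return w


def fit_residuals(w):
    """Approximate ML estimate of phi and the residuals a^_1, ..., a^_n."""
    n = len(w)
    num = sum(w[t] * w[t - 1] for t in range(1, n))   # sum_{t=2}^{n} w_t w_{t-1}
    den = sum(w[t] ** 2 for t in range(1, n - 1))     # sum_{t=2}^{n-1} w_t^2
    phi_hat = (n - 2) / (n - 1) * num / den
    # max(0, .) is our guard for the rare case |phi_hat| >= 1.
    resid = [math.sqrt(max(0.0, 1.0 - phi_hat ** 2)) * w[0]]
    resid += [w[t] - phi_hat * w[t - 1] for t in range(1, n)]
    return phi_hat, resid


def q_statistics(resid, m):
    """Return (Q, Q~) computed from the first m residual autocorrelations."""
    n = len(resid)
    denom = sum(a * a for a in resid)
    q = q_mod = 0.0
    for k in range(1, m + 1):
        r_k = sum(resid[t] * resid[t - k] for t in range(k, n)) / denom
        q += n * r_k ** 2
        q_mod += n * (n + 2) * r_k ** 2 / (n - k)
    return q, q_mod


def monte_carlo(n=50, m=10, phi=0.5, reps=1000, crit=16.919, seed=1):
    """Empirical means and rejection rates of Q and Q~ at critical value crit."""
    rng = random.Random(seed)
    sum_q = sum_q_mod = 0.0
    rej_q = rej_q_mod = 0
    for _ in range(reps):
        _, resid = fit_residuals(simulate_ar1(n, phi, rng))
        q, q_mod = q_statistics(resid, m)
        sum_q += q
        sum_q_mod += q_mod
        rej_q += q > crit
        rej_q_mod += q_mod > crit
    return sum_q / reps, sum_q_mod / reps, rej_q / reps, rej_q_mod / reps


if __name__ == "__main__":
    print(monte_carlo())
```

With far fewer replications than the 4000 used in the study, the output should be read only as a rough check on the direction of the effect, not as a reproduction of the tabulated values.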


Table 1 shows the proportion of $Q(\hat{r})$ and $\tilde{Q}(\hat{r})$ values
exceeding the upper i) 5, ii) 10 and iii) 25 percentage points of the
$\chi^2_{m-1}$ distribution for a few combinations of n and m and for
$\phi$ = .5. The table also gives the means and variances of the observed
distributions. It seems clear that although the variance of
$\tilde{Q}(\hat{r})$ exceeds 2(m-1), a test based on this statistic would for
smaller sample sizes provide a considerable improvement over the previously
used $Q(\hat{r})$ test.


An Alternative Test Based on $Q(\hat{r})$

The above results suggest that a closer approximation to the distribution of
$Q(\hat{r})$ should be obtainable by appropriately adjusting the mean of the
approximating distribution. Furthermore, Table 1 shows values of
$\mathrm{Var}\,Q(\hat{r})$ which are nearly twice the mean, suggesting the
approximation

$$Q(\hat{r}) \sim \chi^2_{E\,Q(\hat{r})}$$

with $E\,Q(\hat{r})$ given by (3.3). Empirical significance levels obtained
using this approximation and the criterion $\tilde{Q}(\hat{r})$ are compared in
Table 2. The agreement is quite close.
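As a worked illustration of this moment matching (the arithmetic is ours, using (3.3) and the Table 1 entries for the original criterion at n = 50, m = 20, with p + q = 1 for the fitted first order model), recall that a $\chi^2_\nu$ variable has mean $\nu$ and variance $2\nu$:

```latex
\begin{align*}
E\,Q(\hat{r}) &\approx \frac{20 \cdot 50}{52}\left(1 - \frac{21}{100}\right) - 1 \approx 14.19, \\
\text{observed mean} &= 13.96, \qquad
\text{observed variance} = 27.50 \approx 2 \times 13.96 = 27.92 .
\end{align*}
```

On this reading the reference distribution for the original criterion has about 14.2 degrees of freedom rather than m - 1 = 19, which is consistent with the low rejection rates reported for that criterion when it is referred to $\chi^2_{m-1}$.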


A Power  Calculation 

The two criteria $Q(\hat{r})$ and $\tilde{Q}(\hat{r})$ differ essentially in
the weighting which is applied to the autocorrelations $\hat{r}_k$,
$\tilde{Q}(\hat{r})$ giving more emphasis to later autocorrelations. This would
perhaps be an advantage if serial correlation occurs at high lags k. However,
for large n this difference should
Table 1. Empirical means, variances and significance levels of the statistics
$Q(\hat{r})$ and $\tilde{Q}(\hat{r})$; data generated from the model
$w_t - .5 w_{t-1} = a_t$.

                    $Q(\hat{r})$                          $\tilde{Q}(\hat{r})$
                            Level (percent)                       Level (percent)
   n    m    mean  variance   5.0   10.0   25.0    mean  variance   5.0   10.0   25.0

  50   10    7.48    13.79    2.3    4.7   13.4    8.82    19.11    5.3    9.5   23.0
       20   13.96    27.50    1.3    2.3    6.4   18.58    47.76    6.1   10.4   23.2

 100   10    8.14    16.04    3.4    7.0   18.2    8.83    18.88    5.0    9.9   23.1
       20   16.26    35.45    2.5    5.0   13.1   18.63    46.46    5.8   10.2   22.8
       30   23.53    55.74    1.7    3.6    9.1   28.58    81.71    7.2   11.6   23.4

 200   10    8.57    16.76    4.2    8.3   21.5    8.92    18.16    5.0    9.8   23.9
       20   17.46    36.36    3.5    6.9   17.6   18.66    41.51    5.4   10.0   22.7
       30   26.11    56.01    2.9    5.6   14.2   28.66    67.37    5.9   10.5   23.8


Table 2. Empirical significance levels based on the approximations
$Q(\hat{r}) \sim \chi^2_{E\,Q(\hat{r})}$ and
$\tilde{Q}(\hat{r}) \sim \chi^2_{E\,\tilde{Q}(\hat{r})}$; data generated from
the model $w_t - \phi w_{t-1} = a_t$.

      Level (percent)
    5.0   10.0   25.0

           9.3   23.4
           9.4   24.0
           9.9   25.4
           9.9   24.2
           9.6   24.0
           9.8   23.9

    5.9   10.1   22.5
    5.9   10.2   22.5
    6.1   10.4   23.2
    6.7   11.3   24.0
    7.9   12.8   25.7

    5.9   10.0   22.7
    6.0    9.8   23.1
    6.0   10.1   22.9
    6.2   10.3   23.2
    7.0   11.2   24.1

    5.5   10.2   23.2
    5.4   10.1   22.8
    5.4   10.0   22.7
    5.3   10.5   22.8


be rather small. If the type of discrepancies to be expected is known, tests
specifically aimed at detecting these discrepancies should be used. Such
specific tests will of course be much more powerful. This point is illustrated
in Table 3 which empirically compares the power of the overall tests and the
method of "overfitting" [Box and Jenkins (1970)]. The results are based on
data generated from a second order autoregressive model, with a first order
model being fitted to obtain $Q(\hat{r})$ and $\tilde{Q}(\hat{r})$. As might be
expected, the overall tests are much less powerful than overfitting, which
tests the hypothesis that the second order autoregressive coefficient is zero.
A smaller value of m improves the power of the overall tests for this
particular alternative.
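The difference in weighting between the two criteria, mentioned at the start of this subsection, can be made explicit by writing the modified criterion as a reweighted version of the original one. The rearrangement and the numerical values below are ours, added for illustration:

```latex
\begin{align*}
\tilde{Q}(\hat{r}) &= \sum_{k=1}^{m} \frac{n+2}{n-k}\, n\,\hat{r}_k^2 , \\
n = 100:&\quad \frac{n+2}{n-k} \approx 1.03 \ (k=1), \quad 1.13 \ (k=10), \quad 1.46 \ (k=30), \\
n = 1000:&\quad \frac{n+2}{n-k} \approx 1.03 \ (k=30) .
\end{align*}
```

The relative weight grows with the lag k but tends to one for every fixed k as n increases, which is why the two criteria should behave similarly for large n.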

Effect of nonnormality of the $a_t$'s

In developing the overall test, the assumption is made that the innovations
$a_t$ in the model are normally distributed. Circumstances occur where this
assumption is not true. For example, it is known that stock price innovations
often have highly leptokurtic distributions. While Anderson and Walker (1964)
show that the asymptotic normality of the $r_k$ does not require normality of
the $a_t$'s, only that their variance exists, results for finite sample sizes
are lacking. An empirical investigation was therefore conducted into the
behavior of the statistic $\tilde{Q}(\hat{r})$ when the $a_t$'s have i) a double
exponential and ii) a uniform distribution. The results, which are given in
Table 4, agree closely with those obtained under the normality assumption for
the $a_t$'s in Table 1.
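The two non-normal innovation distributions used in this investigation can be generated with the Python standard library. The sketch below is ours and makes one assumption not stated in the text: both distributions are scaled to unit variance. Since the autocorrelations are ratios, the scale does not affect the statistics. The function names are hypothetical.

```python
import math
import random


def double_exponential(rng: random.Random) -> float:
    """Unit-variance double exponential draw (difference of two exponentials)."""
    rate = math.sqrt(2.0)  # variance of the difference is 2 / rate**2 = 1
    return rng.expovariate(rate) - rng.expovariate(rate)


def uniform_innovation(rng: random.Random) -> float:
    """Unit-variance uniform draw on (-sqrt(3), sqrt(3))."""
    half_width = math.sqrt(3.0)  # variance is half_width**2 / 3 = 1
    return rng.uniform(-half_width, half_width)


def sample_variance_and_kurtosis(x):
    """Return (m2, m4 / m2**2) computed from central sample moments."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m2, m4 / m2 ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    for draw in (double_exponential, uniform_innovation):
        sample = [draw(rng) for _ in range(20000)]
        print(draw.__name__, sample_variance_and_kurtosis(sample))
```

The double exponential is heavier tailed than the normal (kurtosis 6 against 3) and the uniform is lighter tailed (kurtosis 1.8), so the pair brackets the normal case from both sides.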

5. EXTENSION TO TRANSFER FUNCTION-NOISE MODELS

To check the adequacy of a fitted model of the form

$$y_t = \frac{\omega(B)}{\delta(B)} \alpha_{t-b} + \frac{\theta(B)}{\phi(B)} a_t$$
Table 3. Empirical power (percent) of the overall tests and the method of
overfitting for n = 100. Assumed model: $w_t - \phi w_{t-1} = a_t$;
true model: $(1 - .7B)(1 - \dots)w_t = a_t$.
Nominal significance level: 5 percent.

  Test                                            m

  Overfitting

  $Q(\hat{r}) \sim \chi^2_{E\,Q(\hat{r})}$       10    4.7    6.7   28.6   72.0   96.6   99.9
                                                 20    5.6    7.3   24.4   62.8   93.7   99.7
                                                 30    6.0    7.7   22.9   58.1   91.7   99.5

  $\tilde{Q}(\hat{r}) \sim \chi^2_{m-1}$         10    4.9    7.0   28.9   71.6   96.2   99.9
                                                 20    6.2    8.0   24.7   61.7   93.2   99.6
                                                 30    7.0    9.0   23.7   57.0   90.5   99.3


Table 4. Empirical means, variances and significance levels of
$\tilde{Q}(\hat{r})$ when the innovations $a_t$ have i) a double exponential
and ii) a uniform distribution; data generated from the model
$w_t - .5 w_{t-1} = a_t$.

             i) $a_t$ ~ double exponential           ii) $a_t$ ~ uniform
                            Level (percent)                       Level (percent)
   n    m    mean  variance   5.0   10.0   25.0    mean  variance   5.0   10.0   25.0

  50   10    8.50    18.59    4.7    8.6   20.7    9.01    19.35    5.6   10.0   24.4
       20   17.77    47.00    5.4    8.8   19.6   18.95    52.39    7.3   12.1   24.3

 100   10    8.80    18.70    5.0    9.1   22.4    9.11    19.41    5.5   10.8   25.3
       20   18.37    43.62    4.8    9.2   22.0   19.00    47.52    6.4   11.5   25.7
       30   27.94    76.60    6.3   10.1   21.9   28.98    81.72    7.5   12.4   25.3

 200   10    8.86    18.78    4.9   10.3   23.9    9.00    19.24    5.6   10.1   25.7
       20   18.60    43.33    5.6    9.7   23.6   18.93    43.99    6.2   11.2   25.2
       30   28.46    69.67    6.0   10.5   23.2   28.94    72.29    6.6   11.6   25.3


where

$$\omega(B) = \omega_0 - \omega_1 B - \dots - \omega_u B^u$$

$$\delta(B) = 1 - \delta_1 B - \dots - \delta_v B^v$$

and $\phi(B)$ and $\theta(B)$ are as in (1.1), and where the input series
$\{\alpha_t\}$ is assumed to be white noise, it is useful to examine, in
addition to the residual autocorrelations $\hat{r}_k$, the crosscorrelations
between the residuals and the input series

$$\hat{r}^*_k = \frac{\sum_{t=k+1}^{n} \alpha_{t-k} \hat{a}_t}{\left\{ \sum_{t=1}^{n} \alpha_t^2 \sum_{t=1}^{n} \hat{a}_t^2 \right\}^{1/2}} .$$

Pierce (1968, 1972) and Box and Jenkins (1970) propose an overall test for
lack of fit in the transfer function $\omega(B)/\delta(B)$ based on
approximating the distribution of

$$S(\hat{r}^*) = n \sum_{k=0}^{m} (\hat{r}^*_k)^2$$

by a $\chi^2_{m-v-u}$ distribution. However, arguing as above it appears that a
criterion of the form

$$\tilde{S}(\hat{r}^*) = n^2 \sum_{k=0}^{m} (n-k)^{-1} (\hat{r}^*_k)^2$$

might be more appropriate. This is suggested by the fact that the distribution
of

$$\tilde{S}(r^*) = n^2 \sum_{k=0}^{m} (n-k)^{-1} (r^*_k)^2 ,$$

where